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Abstract 

Efficient  modeling  of  electromagnetic  scattering  has  always  been  an  active  topic  in  the  field  of  computational 
electromagnetics.  To  reduce  the  memory  and  CPU  time  in  the  method  of  moments  (MoM)  solution,  an  efficient 
method  based  on  pseudo  skeleton  approximation  is  presented  in  this  report.  The  algorithm  is  purely  algebraic,  and 
therefore  its  performance  is  not  associated  with  the  kernel  functions  in  the  integral  equations.  The  algorithm  starts 
with  a  multilevel  partitioning  of  the  computational  domain,  which  is  very  similar  to  the  technique  employed  in 
multilevel  fast  multipole  algorithm  (MLFMA).  Any  of  the  impedance  sub-matrices  (with  size  of  to  x  n)  associated 
with  the  well-separated  partitioning  clusters  (far  interaction  terms)  is  represented  by  the  product  of  two  much 
smaller  matrices  (with  sizes  of  to  x  r  and  r  x  n),  where  r  is  the  effective  rank.  Therefore,  the  memory  requirement 
will  be  relieved  and  the  total  CPU  time  will  be  reduced  significantly  as  well,  since  the  rank  is  much  smaller  than  the 
original  matrix  dimensions.  It  should  be  noted  that  we  don’t  have  to  calculate  all  the  impedance  entries  to  implement 
the  aforementioned  decomposition.  Instead,  we  only  need  to  calculate  a  few  randomly  chosen  rows  and  columns 
of  those  impedance  entries.  Further  compressions  based  on  singular  value  decomposition  (SVD)  are  performed  so 
that  the  rank  reaches  its  optimal  limit,  which  leads  to  the  optimized  final  matrix  compression.  Numerical  examples 
are  provided  to  show  the  validity  of  the  new  algorithm.  Future  work  directions  are  also  discussed  in  this  report. 

Index  Terms 

Pseudo  skeleton  approximation,  adaptive  cross  approximation,  low  rank  matrix  approximation,  singular  value 
decomposition,  electromagnetic  scattering,  sea  surface  scattering 
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I.  Introduction 

Method  of  moments  (MoM)  [1]  has  been  a  very  popular  approach  in  solving  electromagnetic  scattering 
problems.  However,  it  has  also  raised  challenging  issues  since  it  suffers  from  the  memory  requirement  for 
a  large  dense  impedance  matrix  and  computational  complexity  for  large-scale  problems.  It  is  well  known 
that  significant  progress  has  been  made  in  employing  the  fast  multipole  method  (FMM)  [2],  [3],  [4], 

[5] ,  to  relieve  the  aforementioned  limitations.  For  example,  multilevel  fast  multipole  algorithm  (MLFMA) 

[6] ,  [7]  incorporated  with  the  iterative  techniques  can  reduce  the  numerical  complexity  to  0(N  log  N)  to 
solve  integral  equations.  However,  one  of  the  major  disadvantages  of  this  approach  is  that  the  algorithm 
is  NOT  independent  of  the  integral  equation  kernel.  That  is,  for  integral  equations  with  different  kernels, 
one  has  to  make  appropriate  modifications  [7]  to  implement  the  fast  algorithm.  Another  approach  to  the 
compression  of  operators  is  based  on  wavelets  [8],  [9],  which  exploit  the  smoothness  of  the  elements  of 
the  matrix  viewed  as  a  function  of  their  indices  and  tends  to  fail  for  highly  oscillatory  operators. 

On  the  other  hand,  approaches  based  on  low-rank  representation  of  impedance  matrix  blocks  have 
been  also  introduced  in  the  field  of  computational  electromagnetics.  These  include  the  so-called  IES3 
[10],  IE-QR  [11],  [12],  PILOT  (predetermined  interaction  list  oct-tree)  [13],  and  ACA  (adaptive  cross 
approximation)  algorithms  [14],  [15].  In  these  methods,  all  the  impedance  matrix  blocks  (assume  that  the 
size  of  the  matrix  is  m  x  n)  associated  with  the  far  interaction  terms  are  represented  by  a  product  of  two 
much  smaller  matrices  with  sizes  of  m  x  r  and  r  x  n,  where  r  is  the  effective  rank  of  the  matrix  which 
can  be  determined  numerically.  Thus,  the  memory  requirement  can  be  reduced  from  mxn  to  r  x  (m  +  n), 
and  the  same  ratio  of  CPU  time  saving  can  be  obtained  for  matrix-vector  product.  The  beauty  of  these 
algorithms  is  their  purely  algebraic  nature.  That  is,  the  computational  speed-up  is  achieved  by  employing 
linear  algebra  manipulations  of  the  impedance  matrix.  Thus,  the  implementations  of  these  algorithms  do  not 
depend  on  the  complete  knowledge  of  the  integral  equation  kernels.  However,  the  algorithms’  complexity 
is  0(r2(m  +  n)),  which  is  not  trivial  when  m  or  n  is  big  enough.  And  the  other  fatal  limitation  of  these 
algorithms  is  that  they  are  not  stable  when  the  rank  of  the  matrix  is  big  according  to  our  experience. 

It  should  be  noted  that  randomized  algorithms  [16],  [17]can  also  be  used  for  the  construction  of  low- 
rank  approximation  to  matrices.  The  idea  is  that  a  randomized  matrix  is  employed  to  project  the  low-rank 
matrix  to  a  much  smaller  space.  Thus,  a  new  matrix  with  much  smaller  dimensions  is  obtained.  Then  any 
low-rank  decompositon  technique  can  be  used  to  get  the  appropriate  basis  functions.  And  the  corresponding 
coefficient  matrix  can  be  calculated  straight  forward  once  the  basis  functions  are  known.  Unfortunately, 
all  the  entries  of  the  low-rank  matrix  are  needed  for  this  method. 

In  this  report,  the  pseudo  skeleton  approximation  method  [18]  is  employed  for  the  purpose  of  matrix 
decomposition.  There  are  several  advantages  of  the  algorithm  compared  with  the  aforementioned  algo¬ 
rithms.  First,  the  algorithm  is  very  stable,  which  is  very  important;  Secondly,  its  computational  complexity 
is  0(r3),  which  is  independent  of  m  or  n. 

The  report  is  organized  as  follows:  First  in  Section  II  the  pseudo  skeleton  approximation  algorithm 
is  introduced,  and  a  technique  based  on  singular  value  decomposition  (SVD)  [19]  is  employed  for 
further  matrix  compression.  Then  some  numerical  results  are  presented  in  Section  III  to  demonstrate 
the  performance  of  the  current  approach,  followed  by  the  discussions  and  future  work  directions. 

II.  Efficient  Low-Rank  Matrix  Decomposition  Using  Pseudo  Skeleton  Approximation 

In  the  field  of  computational  electromagnetics,  and  many  other  areas,  one  often  encounters  matrices 
whose  ranks  are  much  lower  than  their  dimensionalities  (rank  deficient).  Discretization  of  integral  equations 
almost  always  results  in  matrices  of  this  type.  For  such  kind  of  matrices,  one  is  tempted  to  ’’compress” 
the  matrices  in  question  so  that  they  could  be  efficiently  applied  to  arbitrary  vectors,  and  at  the  same  time 
the  storage  requirement  can  also  be  reduced  (compressed)  as  well. 

It  should  be  noted  that  the  entire  impedance  matrix  obtained  through  MoM  is  neither  singular  nor  rank 
deficient  except  at  the  internal  resonances.  However,  if  all  the  unknowns  are  grouped  into  clusters  like  in 
MLFMA,  then  all  the  sub-matrix  blocks  representing  the  interactions  between  two  well-separated  clusters 
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(associated  with  the  far  interaction  terms)  are  rank  deficient  and  they  can  be  compressed  efficiently  by 
using  the  pseudo  skeleton  approximation  method  described  below.  As  in  MLFMA,  all  the  diagonal  sub¬ 
matrix  blocks  (associated  with  self  interactions)  as  well  as  all  the  sub-matrix  blocks  associated  with  the 
interactions  of  any  neighboring  clusters  are  to  be  calculated  directly  via  the  conventional  MoM  approach. 

In  this  Section,  we  will  first  give  a  brief  introduction  of  the  low-rank  approximation.  Then  details  of 
the  pseudo  skeleton  approximation  will  be  presented. 

A.  Outline  of  low-rank  matrix  compression 

Without  loss  of  generality,  assume  that  the  size  of  a  low-rank  matrix  A  (representing  interactions 
between  two  well-separated  clusters)  is  m  x  n.  Our  main  aim  is  to  decompose  the  matrix  A  into  two 
much  smaller  sub-matrices  U  and  V.  The  original  matrix  A  can  be  reconstructed  through  a  product  of  U 
and  V.  Namely, 

A(m  x  n)  ~  U(m  x  r)V(r  x  n )  (1) 

where  r  is  the  effective  rank  of  matrix  A. 

Instead  of  storing  mxn  impedance  entries  in  the  conventional  method,  the  above  low-rank  compression 
technique  only  requires  to  store  r  x  (m  +  n)  impedance  entries. 

Similarly,  when  the  matrix  A  is  directly  applied  to  a  vector,  the  computational  complexity  is  mxn.  In 
contrast,  we  can  first  apply  matrices  V  and  then  U  to  the  vector  in  sequence,  the  associated  complexity 
will  be  r  x  (m  +  n) . 

For  the  electromagnetic  problems,  generally  r  <C  min(m,  n).  Therefore,  much  CPU  time  will  be  reduced 
and  the  memory  requirement  will  be  relieved  significantly  as  well. 

It  has  been  proofed  theoretically  that  SVD  would  find  the  best  decomposition  with  given  rank.  In 
other  words,  for  a  given  accuracy,  SVD  will  find  the  associated  lowest  rank.  However,  the  algorithm  is 
very  expensive,  especially  when  the  matrix  dimensions  are  big.  Direct  application  of  QR  factorization  has 
similar  drawback.  Therefore,  in  this  report  pseudo  skeleton  approximation  will  be  employed  to  decompose 
matrix  A  into  U  and  V. 

B.  Pseudo  skeleton  approximation 

Before  going  to  the  details  of  the  pseudo  skeleton  approximation,  we  first  review  the  skeleton  approx¬ 
imation  method. 

Assume  that  the  rank  of  the  matrix  A  is  r.  Then  there  exists  a  nonsingular  r  x  r  submatrix  A  in  A. 
Denote  the  columns  and  rows  of  A  containing  the  submatrix  A  by  C  (with  size  of  m  x  r)  and  R  (with 
size  of  r  x  n),  respectively.  That  is,  submatrix  A  is  the  intersection  of  C  and  R.  Then  it  is  easy  to  verify 
that 

A(m  x  n)  ~  C(m  x  r)A_1(r  x  r)R(r  x  n )  (2) 

This  decomposition  is  known  as  a  skeleton  approximation  of  A. 

The  problem  with  skeleton  approximation  is  that  one  should  identify  which  columns  and  rows  should 
be  chosen.  Random  selection  will  lead  to  a  singular  A,  thus  not  enough  bases  are  embedded  in  those 
columns  and  rows,  and  the  inverse  will  not  be  available,  which  results  in  a  fail  decomposition. 

ACA  can  be  applied  to  get  the  above  decomposition  adaptively.  That  is,  columns  of  C  and  rows  of 
R  are  iteratively  added  until  an  error  criterion  is  reached.  However,  as  mentioned  before,  ACA  is  still 
expensive  in  the  sense  of  complexity,  and  it  is  not  stable  when  the  rank  of  matrix  is  big  enough. 

Actually,  the  matrix  can  be  approximated  by 

A(m  x  n)  ~  C(m  x  r)G(r  x  r)R(r  x  n) 


(3) 
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where  G  is  not  necessary  equal  to  the  inverse  of  A  and  even  not  necessarily  nonsingular.  For  example, 
G  can  be  chosen  as  the  pseudo  inverse  of  A.  This  kind  of  decomposition  is  called  the  pseudo  skeleton 
approximation. 

Once  the  matrices  C,  G,  and  R  are  obtained,  one  can  easily  obtained  the  matrices  U  and  V  defined  in 

(1): 

U  =  C  (4) 

V  =  GR  (5) 

Or  we  can  have: 

U  =  CG  (6) 

V  =  R  (7) 

Unlike  using  an  iterative  approach  in  ACA  to  find  the  columns  of  C  and  rows  of  R,  here  we  just 
randomly  choose  k  columns  and  k  rows  from  A,  where  k  is  a  number  large  enough  so  that  r  most 
important  bases  will  be  embedded  in  both  column  and  row  sub-matrices.  Numerical  experiments  show 
that  k  =  3r  is  good  enough  to  obtain  excellent  results. 

The  complexity  of  computing  the  pseudo  inverse  of  A  is  0(r3),  which  is  much  less  compared  to  ACA 
considering  the  fact  that  r  -C  min  (m,  n). 

C.  Further  compression  of  the  matrices 

It  should  be  noted  that  the  dimensions  of  the  matrices  C,  R,  and  G  are  m  x  k,  k  x  n,  and  k  x  k, 
respectively.  This  guarantees  that  enough  information  is  included  in  the  matrices  C  and  R.  Therefore,  the 
sizes  of  the  associated  matrices  U  and  V  are  m  x  k  and  k  x  n,  respectively. 

U  and  V  must  contain  much  redundancies  since  the  rank  (r)  of  the  matrix  A  is  smaller  than  k.  On  the 
other  hand,  the  vectors  inside  U  and  V  are  generally  not  orthogonal.  To  remove  the  redundancies,  both 
QR  factorization  and  SVD  can  be  employed  here. 

First  employing  QR  factorization  to  decompose  the  matrix  U  and  the  adjoint  of  V,  we  have: 


U(mx  k)  —  Qu  (rn  x  k)Ru{k  x  k )  (8) 

V'(n  x  k)  =  Qv(n  x  k)Rv{k  x  k)  (9) 

Then  SVD  is  employed  to  decompose  the  product  of  matrices  Ru(k  x  k )  and  R'v{k  x  k): 

U(k  x  r)S(r  x  r)V(r  x  k)  =  RuR'v(k  x  k)  (10) 

During  this  step,  the  effective  rank  r  of  the  matrix  A  is  determined: 

r  =  sum(\diag(S)\  >  tol  ■  (S^l,  1) |)  (11) 

where  tol  is  the  relative  tolerance.  Generally,  tol  is  chosen  to  be  Hr  3. 

Hence  the  final  version  of  the  decomposition  of  matrix  A  is  as  follows: 

Ufinai(m  x  r)  =  Qu(m  x  k)U(k  x  r)  (12) 

V final (r  xn)  =  S(r  x  r)(Qv(n  x  k)V'(k  x  r))'  (13) 


Note  that  QR  factorization  and  SVD  are  applied  on  matrices  with  much  smaller  dimensions. 
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III.  Numerical  Examples 

In  this  Section,  we  will  present  some  numerical  examples  to  show  the  accuracy  and  efficiency  of  the 
new  algorithm. 

A.  Selection  of  the  sample  numbers 

An  EM  related  impedance  submatrix  representing  interaction  between  well  separated  groups  is  employed 
here  for  the  study  of  the  selection  of  the  sample  numbers. 

The  size  of  the  impedance  matrix  is  280  x  280,  and  its  effective  rank  is  8  (determined  numerically 
according  to  equation  11). 

The  relative  errors  (the  ratio  of  the  Frobenius  norm  of  the  difference  matrix  to  the  original  matrix)  as 
a  function  of  sample  numbers  are  shown  in  Figure  1. 


Fig.  1.  Relative  errors  as  a  function  of  sample  numbers.  The  horizontal  axis  is  the  number  of  samples,  and  the  vertical  axis  is  the  relative 
error. 


From  the  Figure,  we  can  see  that  the  relative  error  is  in  the  order  of  1 0  5  when  the  number  of  samples 
is  3  times  of  the  effective  rank.  The  relative  error  is  in  the  order  of  10-4  when  the  number  of  samples  is 
just  twice  of  the  effective  rank,  this  should  be  good  enough  for  most  applications. 

B.  Accuracy  of  the  pseudo  skeleton  approximation 

To  test  the  accuracy  of  the  algorithm,  we  generate  a  random  complex  low-rank  matrix.  The  size  of  the 
matrix  is  1200  x  1200,  and  its  rank  is  10.  The  real  and  imaginary  part  of  the  matrix  are  shown  in  Figure 
2. 

30  randomly  chosen  columns  and  rows  are  used  to  obtain  its  pseudo  skeleton  decomposition.  The 
differences  between  the  reconstructed  matrix  and  the  original  one  are  shown  in  Figure  3.  Clearly,  we  can 
see  that  the  pseudo  skeleton  approximation  algorithm  performs  excellent. 

C.  RCS  of  a  rectangular  PEC  plate 

When  the  pseudo  skeleton  approximation  method  is  applied  for  electromagnetic  scattering  problem,  as 
mentioned  before,  the  computational  domain  is  first  broken  into  a  lot  of  subgroups  at  different  levels  like 
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(b) 

Fig.  2.  A  random  complex  low-rank  matrix.  The  size  and  rank  of  the  matrix  are  1200  x  1200  and  10,  respectively.  (a)Real  part;  (b)Imaginary 
part. 


Fig.  3. 


(b) 


Difference  between  the  reconstructed  matrix  and  the  original  one.  (a)Real  part;  (b)Imaginary  part. 
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Fig.  4.  Typical  rank  map  of  an  EM  problem.  The  target  here  is  a  8A  x  8A  rectangular  plate,  and  the  computational  domain  is  divided  into 
16  subgroups  (2  levels).  The  sizes  of  all  the  submatrices  are  around  1200  x  1200. 

in  MLFMA.  In  each  level,  those  submatrices  representing  the  interactions  of  well-separated  subgroups 
are  compressed  via  the  pseudo  skeleton  approximation.  Then  the  equations  are  solved  by  the  conjugate 
gradient  method. 

As  an  example,  a  typical  rank  map  of  a  rectangular  plate  is  shown  in  Figure  4,  where  the  computational 
domain  is  broken  in  2  levels.  From  the  Figure  we  can  see  that  there  are  a  lot  of  low-rank  submatrices,  and 
the  ranks  are  very  small  compared  to  the  their  dimensions  (around  1200  x  1200).  And  those  submatrices 
associated  with  the  neighboring  subgroups  (blank  blocks  in  Figure  4  can  be  further  divided  in  higher 
levels.  Therefore,  a  lot  of  memory  and  CPU  time  can  be  saved. 

To  show  the  performance  of  the  pseudo  skeleton  approximation,  a  rectangular  PEC  plate  is  considered 
here  for  the  electromagnetic  scattering  analysis.  The  size  of  the  plate  is  4A0  x  4A0,  where  A0  is  the 
wavelength  in  free  space.  The  incident  angles  are:  9i  =  45°,  and  o%  =  0;  and  the  scattering  angles  are: 
6S  =  45°,  and  os  varies  from  0  ~  360°.  The  associated  radar  cross  sections  (RCS)  are  shown  in  Figure  5. 

MoM  results  are  also  presented  as  reference  solutions.  Four  different  curves  are  shown  in  each  Figure. 
We  can  observe  that  the  low-rank  approximation  results  of  both  co-  and  cross-polarizations  agree  with 
the  MoM  solutions  very  well. 

D.  RCS  of  rough  sea  surface 

The  electromagnetic  scattering  analysis  of  an  example  rough  sea  surface  is  presented  in  this  Section. 
The  sea  surface  used  in  this  example  is  based  on  the  Pierson-Moskowitz  model.  The  schematic  of  the  sea 
surface  generation  is  shown  in  Figure  6. 

The  Pierson-Moskowitz  spectrum  is  defined  as  follows: 

S(u)  =  ^exp[-/3(— )]  (14) 

u>°  u 

where  u  =  2nf,  a  =  8.1  x  1CT3,  (3  =  0.74,  u0  =  g/U\g.5.  f  is  the  frequency  of  the  electromagnetic 
wave,  U 1 9  5  is  the  wind  speed  at  a  height  of  19.5m  above  the  sea  surface,  and  g  is  the  acceleration  of 
gravity.  An  example  rough  sea  surface  with  rms  height  of  0.1m  (equivalent  wind  speed  is  4.33m/s)  is 
shown  in  Figure  7.  In  this  example,  the  frequency  of  the  incident  wave  is  300 MHz,  hence  the  roughness 
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(a) 


(b) 

Fig.  5.  RCS  of  a  rectangular  PEC  plate:  MoM  vs.  pseudo  skeleton  low-rank  decomposition.  (a)Co-polarization;  (b)Cross-polarization.  The 
horizontal  axis  is  the  observation  angles,  and  the  vertical  axis  is  the  radar  cross  section 
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Fig.  6.  Schematic  of  rough  sea  surface  generation. 
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Fig.  7.  An  example  rough  sea  surface  with  rms  height  of  0.1m. 


of  the  sea  surface  is  quite  small  in  wavelength.  Therefore,  the  reflection  around  the  forward  direction  is 
still  very  strong,  as  expected. 

The  size  of  the  sea  surface  is  10A0  x  10A0.  The  incident  elevation  angles  are  0,  =  —75°  and  fa  =  0.  To 
reduce  the  edge  effects,  a  tapering  function  is  employed  for  smoothing.  The  bistatic  RCS  of  1  realization 
is  shown  in  Figure  8. 

E.  Complexity  of  the  algorithm  compared  with  MLFMA 

To  show  the  performance  of  the  pseudo  skeleton  approximation,  here  the  CPU  time  for  calculate  one 
matrix-vector  product  is  compared  with  the  counterpart  of  MLFMA. 

Five  cases  with  different  number  of  unknowns  are  considered  by  both  MLFMA  and  pseudo  skeleton 
approximation,  and  the  CPU  times  are  shown  in  Figure  9.  As  a  reference,  a  line  proportional  to  N  log  Ar 
is  also  presented.  From  the  Figure,  one  can  observe  that  the  algorithm  based  on  the  pseudo  skeleton 
approximation  is  about  10  times  faster  than  MLFMA. 

It  should  be  noted  that  there  is  a  overhead  to  compress  all  the  submatrices  associated  with  the  well 
separated  groups.  However,  one  often  need  to  calculate  monostatic  RCS  at  a  lot  of  different  angles  in 
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Fig.  8.  Bistatic  RCS  of  rough  sea  surface.  The  horizontal  axis  is  the  angle  9g,  and  the  vertical  axis  is  the  RCS. 


Number  of  unknowns 


Fig.  9.  Comparison  of  CPU  time  between  MLFMA  and  the  pseudo  skeleton  approximation.  The  horizontal  axis  is  the  number  of  unknowns, 
and  the  vertical  axis  is  the  CPU  time  in  seconds. 


reality.  That  is,  there  will  be  a  large  number  of  right-hand  sides  in  solving  the  Maxwell  integral  equations, 
while  the  impedance  matrix  (the  left-hand  side)  remains  unchanged.  In  these  cases,  the  algorithm  based 
on  the  pseudo  skeleton  approximation  should  perform  much  better  since  the  overhead  will  be  amortized. 

IV.  Discussions  and  Future  Work 

In  this  report,  we  present  a  novel  algorithm  based  on  pseudo-skeleton  approximation  for  fast  elec¬ 
tromagnetic  scattering  analysis.  In  summary,  the  algorithm  is  purely  algebraic  in  nature  and  hence  its 
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implementation  is  integral  equation  kernel  independent.  The  algorithm  first  divides  the  whole  compu¬ 
tational  domain  into  a  lot  of  groups  in  different  levels,  exactly  the  same  as  wha  is  done  in  MLFMA. 
Then  all  the  submatrices  representing  well  separated  groups  are  compressed  by  the  pseudo  skeleton 
approximation.  It  should  be  noted  that  only  partial  impedance  entries  are  needed  for  this  step,  hence  the 
compression  can  be  implemented  efficiently.  Next,  further  compressiones  based  on  SVD  are  employed 
so  that  the  sizes  of  the  decomposed  submatrices  will  be  close  to  their  real  ranks.  Thus,  the  compressions 
are  maximized.  Numerical  examples  show  that  the  method  does  perform  very  well  compared  with  the 
conventional  MLFMA. 

There  are  a  lot  of  work  for  further  exploration  in  this  area.  An  imediate  list  is  as  follows: 

(1) Further  optimize  the  algorithm,  so  the  memory  requirement  and  the  CPU  time  can  be  further 
reduced.  For  example,  check  if  two  neighboring  rank  defficient  submatrices  can  be  merged.  Without 
loss  of  generality,  we  consider  two  neighboring  submatrices  with  size  of  1000  x  1000  and  rank  of  10 
(this  is  a  typical  case  in  EM  problems).  Based  on  the  pseudo  skeleton  approximation,  both  of  them  can 
be  compressed  as  a  produce  of  two  smaller  submatrices  with  sizes  of  1000  x  10  and  10  x  1000.  The 
total  memory  requirement  should  be  2  x  (1000  x  10  +  10  x  1000)  =  40000  for  them.  However,  if  we  can 
integrate  them  together,  and  assume  they  are  neighbored  along  column  direction  and  the  rank  is  still  10, 
then  the  total  memory  requirement  will  be  1000  x  10  +  10  x  2000  =  30000,  which  leads  to  a  25%  saving 
in  both  memory  and  CPU  time. 

(2) When  very  large  targets,  for  example,  very  large  area  of  rough  sea  surfaces,  are  considered,  we 
may  incoporate  the  overlapped  domain  decomposition  method  (ODDM)[?]  with  the  pseudo  skeleton 
approximation.  ODDM  is  a  improved  version  of  the  forware/backward  buffer  region  sweep  method  for 
3D  problems.  The  whole  domain  is  divided  into  a  few  overlapped  subdomains.  The  pseudo  skeleton 
approximation  is  used  for  each  subdomains.  Once  the  current  distributions  of  all  the  overlapped  subdomains 
are  obtained,  an  iterative  procedure  can  then  be  employed  to  get  the  current  distribution  on  the  whole 
domain.  Since  the  spurious  edge  effect  of  each  subdomain  can  be  effectively  reduced  becasue  of  the 
overlapped  regions,  the  convergence  of  the  ODDM  iterative  process  can  be  very  fast  (It  was  shown  that 
the  iterative  process  converges  in  2  or  3  iterations  [20],  [21]). 

Since  each  subdomain  can  be  modeled  separately  in  sequence,  thus  the  memory  requirement  is  actually 
mainly  determined  by  the  size  of  each  subdomain,  not  the  whole  computational  domain.  On  the  other 
hand,  the  current  distribution  on  each  subdomain  can  even  be  saved  on  hard  disk,  and  be  read  later  for  the 
iterative  purpose.  Since  the  iterative  process  generally  converges  in  2  or  3  interations,  the  time  for  reading 
information  from  hard  disk  is  definitely  acceptable.  Thus  large  targets  can  be  easily  handled  without  much 
difficulties. 

(3) The  algorithm  based  on  the  pseudo  skeleton  approximation  is  perfectly  suitable  for  parallelization. 
All  the  subgroups  can  be  easily  assigned  to  different  processors,  and  all  the  associated  submatrices  can 
be  decomposed  locally.  More  important  is  that  no  information  is  needed  to  be  exchanged  among  the 
processors  during  the  compression  step.  Similarly,  the  decomposition  submatrices  U  and  V  can  be  allocated 
to  memory  locally  as  well.  Thus  the  parallelized  version  of  pseudo  skeleton  approximation  method  could 
be  very  efficient,  compared  to  the  parallelized  MLFMA,  where  a  lot  of  information  is  passed  around  all 
the  processors. 

Combinding  the  parallelization  technique,  ODDM,  and  the  pseudo  skeleton  approximation,  it  should 
be  very  promising  to  solve  very  large  problems. 

(4) After  parallelizing  the  algorithm,  larger  targets  will  be  manageable.  On  the  basis  of  that,  we  can 
further  investigate  the  submatrices  representing  the  interactions  between  very  well-separated  groups.  The 
contributions  of  those  submatrices  should  be  very  small  and  therefore  they  can  be  neglected  and  there  is 
no  need  to  do  the  compressions.  Once  again,  a  lot  of  memory  can  be  saved. 
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